Rule Pack Reference
Overview
Rule Packs are XML files that contain one or more classification rules along with their metadata and shared resources. They serve as the deployment unit for content detection rules and provide versioning, localization, and management capabilities.
Rule Pack Structure
Basic Structure
```xml
<?xml version="1.0" encoding="UTF-8"?>
<RulePackage xmlns="http://schemas.microsoft.com/office/2011/mce">
  <RulePack id="unique-id">
    <Version major="1" minor="0" build="0" revision="0"/>
    <Publisher id="publisher-id"/>
    <Details defaultLangCode="en">
      <LocalizedDetails langcode="en">
        <PublisherName>Publisher Name</PublisherName>
        <Name>Rule Pack Name</Name>
        <Description>Rule pack description</Description>
      </LocalizedDetails>
    </Details>
    <!-- Rules Section -->
    <Rules>
      <!-- Classification rules go here -->
    </Rules>
    <!-- Resources Section (Optional) -->
    <Resources>
      <!-- Shared resources go here -->
    </Resources>
  </RulePack>
</RulePackage>
```
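Because every element lives in the `http://schemas.microsoft.com/office/2011/mce` namespace, a tool that reads a rule pack has to qualify element names when it looks them up. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the names `read_pack_header` and `SAMPLE` are illustrative and not part of any product API.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used for lookups; the prefix "mce" is arbitrary.
NS = {"mce": "http://schemas.microsoft.com/office/2011/mce"}

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<RulePackage xmlns="http://schemas.microsoft.com/office/2011/mce">
  <RulePack id="unique-id">
    <Version major="1" minor="0" build="0" revision="0"/>
    <Publisher id="publisher-id"/>
  </RulePack>
</RulePackage>"""


def read_pack_header(xml_text):
    """Return the rule pack id, version tuple, and publisher id."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    pack = root.find("mce:RulePack", NS)
    version = pack.find("mce:Version", NS)
    publisher = pack.find("mce:Publisher", NS)
    return {
        "id": pack.get("id"),
        "version": tuple(
            int(version.get(part))
            for part in ("major", "minor", "build", "revision")
        ),
        "publisher": publisher.get("id"),
    }


print(read_pack_header(SAMPLE))
```

Looking up `RulePack` without the namespace map would return nothing, which is a common source of confusion when scripting against these files.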
Rule Pack Metadata
Required Attributes
| Attribute | Type | Description | Required | Constraints |
|---|---|---|---|---|
| id | String | Unique identifier for the rule pack (set on RulePack) | Yes | Must be unique across all rule packs |
| xmlns | String | XML namespace declaration (set on RulePackage) | Yes | Must be http://schemas.microsoft.com/office/2011/mce |
Version Element
The Version element specifies the rule pack version for management and updates.
| Attribute | Type | Description | Required | Range | Example |
|---|---|---|---|---|---|
| major | Integer | Major version number | Yes | 0-999 | 1 |
| minor | Integer | Minor version number | Yes | 0-999 | 0 |
| build | Integer | Build number | Yes | 0-9999 | 0 |
| revision | Integer | Revision number | Yes | 0-9999 | 0 |
Example:
```xml
<Version major="2" minor="1" build="15" revision="3"/>
```
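The ranges in the table above can be checked before a pack is deployed. The sketch below is one way to do that, assuming the attributes have already been read into a dictionary of strings; `parse_version` and `VERSION_RANGES` are hypothetical names. Returning a tuple makes versions directly comparable.

```python
# Allowed range per component, taken from the Version table.
VERSION_RANGES = {
    "major": (0, 999),
    "minor": (0, 999),
    "build": (0, 9999),
    "revision": (0, 9999),
}


def parse_version(attrs):
    """Validate the four version attributes and return them as a tuple."""
    parts = []
    for name, (low, high) in VERSION_RANGES.items():
        if name not in attrs:
            raise ValueError(f"missing required attribute: {name}")
        value = int(attrs[name])
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")
        parts.append(value)
    return tuple(parts)


current = parse_version(
    {"major": "2", "minor": "1", "build": "15", "revision": "3"}
)
previous = parse_version(
    {"major": "2", "minor": "1", "build": "9", "revision": "9999"}
)
# Tuples compare component by component, left to right.
print(current > previous)
```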
Publisher Element
| Attribute | Type | Description | Required | Example |
|---|---|---|---|---|
| id | String | Unique publisher identifier | Yes | "company-security-team" |
Example:
```xml
<Publisher id="cyberhaven-content-rules"/>
```
Details Element
The Details element contains localized metadata about the rule pack.
| Attribute | Type | Description | Required | Values |
|---|---|---|---|---|
| defaultLangCode | String | Default language code | Yes | ISO 639-1 codes (en, fr, de, etc.) |
LocalizedDetails Sub-elements
| Element | Type | Description | Required | Max Length |
|---|---|---|---|---|
| PublisherName | String | Display name of the publisher | Yes | 256 characters |
| Name | String | Display name of the rule pack | Yes | 256 characters |
| Description | String | Description of the rule pack | Yes | 1024 characters |
Example:
```xml
<Details defaultLangCode="en">
  <LocalizedDetails langcode="en">
    <PublisherName>Cyberhaven Security Team</PublisherName>
    <Name>Financial Data Protection Rules</Name>
    <Description>Classification rules for detecting financial and payment card data</Description>
  </LocalizedDetails>
  <LocalizedDetails langcode="fr">
    <PublisherName>Équipe de sécurité Cyberhaven</PublisherName>
    <Name>Règles de protection des données financières</Name>
    <Description>Règles de classification pour détecter les données financières et de cartes de paiement</Description>
  </LocalizedDetails>
</Details>
```
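A consumer of this metadata typically wants the entry for the user's language and something sensible when that language is missing. The sketch below assumes that a missing language falls back to the entry named by `defaultLangCode`; this reference does not state the fallback rule, so treat that as an assumption. The fragment omits the namespace for brevity, and `localized_name` is an illustrative name.

```python
import xml.etree.ElementTree as ET

DETAILS_XML = """<Details defaultLangCode="en">
  <LocalizedDetails langcode="en">
    <Name>Financial Data Protection Rules</Name>
  </LocalizedDetails>
  <LocalizedDetails langcode="fr">
    <Name>Règles de protection des données financières</Name>
  </LocalizedDetails>
</Details>"""


def localized_name(details, lang):
    """Return the Name for lang, falling back to defaultLangCode (assumed)."""
    by_lang = {
        entry.get("langcode"): entry
        for entry in details.findall("LocalizedDetails")
    }
    chosen = by_lang.get(lang)
    if chosen is None:
        chosen = by_lang.get(details.get("defaultLangCode"))
    if chosen is None:
        return None
    return chosen.findtext("Name")


details = ET.fromstring(DETAILS_XML)
print(localized_name(details, "fr"))
print(localized_name(details, "de"))  # no German entry, so the default is used
```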
Rules Section
The Rules section contains all classification rules included in the pack.
Structure
```xml
<Rules>
  <Entity id="rule-id-1" patternsProximity="300" recommendedConfidence="75">
    <!-- Rule definition -->
  </Entity>
  <Entity id="rule-id-2" patternsProximity="150" recommendedConfidence="85">
    <!-- Rule definition -->
  </Entity>
  <!-- Additional rules... -->
</Rules>
```
Rule Container Attributes
| Attribute | Type | Description | Required | Default | Range |
|---|---|---|---|---|---|
| id | String | Unique rule identifier | Yes | - | Must be unique within rule pack |
| patternsProximity | Integer | Maximum distance between patterns | No | 300 | 1-1000 characters |
| recommendedConfidence | Integer | Recommended confidence threshold | No | 75 | 1-100 |
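The defaults and ranges in the table above can be applied when reading an `Entity` element. This is a minimal sketch rather than the engine's own logic; `entity_settings` and `DEFAULTS` are hypothetical names, and the namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# attribute name -> (default, minimum, maximum), from the attribute table
DEFAULTS = {
    "patternsProximity": (300, 1, 1000),
    "recommendedConfidence": (75, 1, 100),
}


def entity_settings(entity):
    """Return the effective settings of an Entity, applying defaults."""
    rule_id = entity.get("id")
    if not rule_id:
        raise ValueError("Entity requires an id attribute")
    settings = {"id": rule_id}
    for name, (default, low, high) in DEFAULTS.items():
        raw = entity.get(name)
        value = default if raw is None else int(raw)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")
        settings[name] = value
    return settings


explicit = ET.fromstring(
    '<Entity id="rule-id-2" patternsProximity="150" recommendedConfidence="85"/>'
)
implicit = ET.fromstring('<Entity id="rule-id-1"/>')
print(entity_settings(explicit))
print(entity_settings(implicit))
```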
Resources Section
The Resources section contains shared resources that can be referenced by multiple rules.
Supported Resource Types
Keyword Lists
```xml
<Resources>
  <Keyword id="financial-terms">
    <Group matchStyle="word">
      <Term>account</Term>
      <Term>balance</Term>
      <Term>payment</Term>
      <Term>transaction</Term>
    </Group>
  </Keyword>
</Resources>
```
Keyword Attributes
| Attribute | Type | Description | Required | Values |
|---|---|---|---|---|
| id | String | Unique keyword list identifier | Yes | Must be unique within rule pack |
| matchStyle | String | How keywords should be matched | No | word, string, regex |
Keyword Group Attributes
| Attribute | Type | Description | Required | Values |
|---|---|---|---|---|
| matchStyle | String | Override match style for this group | No | word, string, regex |
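Since a `Group` can override the `matchStyle` of its parent `Keyword`, the effective style has to be resolved per group. The sketch below shows that resolution; `effective_match_styles` is an illustrative name. What applies when neither element sets `matchStyle` is not stated in this reference, so the sketch returns `None` in that case instead of guessing a default.

```python
import xml.etree.ElementTree as ET

ALLOWED_STYLES = {"word", "string", "regex"}


def effective_match_styles(keyword):
    """Return the effective matchStyle of each Group, in document order."""
    list_style = keyword.get("matchStyle")
    styles = []
    for group in keyword.findall("Group"):
        # A Group-level value takes precedence over the Keyword-level value.
        style = group.get("matchStyle", list_style)
        if style is not None and style not in ALLOWED_STYLES:
            raise ValueError(f"unsupported matchStyle: {style}")
        styles.append(style)
    return styles


keyword = ET.fromstring(
    '<Keyword id="financial-terms" matchStyle="string">'
    '<Group matchStyle="word"><Term>account</Term></Group>'
    "<Group><Term>balance</Term></Group>"
    "</Keyword>"
)
print(effective_match_styles(keyword))
```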
Regular Expression Patterns
```xml
<Resources>
  <Regex id="credit-card-pattern">
    <Pattern>(?:\d{4}[-\s]?){3}\d{4}</Pattern>
  </Regex>
</Resources>
```
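It helps to try a pattern against sample text before adding it to a pack. The sketch below exercises the pattern above with Python's `re` module; the classification engine's regex flavor may differ from Python's, so this is an approximation, and `find_card_like` is a hypothetical helper.

```python
import re

# Same expression as the Pattern element above.
CARD_PATTERN = re.compile(r"(?:\d{4}[-\s]?){3}\d{4}")


def find_card_like(text):
    """Return the first 16-digit, optionally separated run, or None."""
    match = CARD_PATTERN.search(text)
    return match.group(0) if match else None


print(find_card_like("Card: 4111 1111 1111 1111 on file"))
print(find_card_like("Ref 4111-1111-1111-1111"))
print(find_card_like("Order 1234 5678"))  # too short, no match
```

The pattern accepts any sixteen digits in groups of four, so on its own it will also match numbers that are not payment cards.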
Built-in Functions
```xml
<Resources>
  <LocalizedStrings>
    <Resource idRef="creditCardValidation">
      <Name default="true" langcode="en">Credit Card Validation</Name>
      <Description default="true" langcode="en">Validates credit card numbers using Luhn algorithm</Description>
    </Resource>
  </LocalizedStrings>
</Resources>
```
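The description above refers to the Luhn algorithm. For readers unfamiliar with it, here is a sketch of the standard Luhn checksum; it illustrates the algorithm only and is not the source of the built-in `creditCardValidation` function, whose implementation is not documented here. `luhn_valid` is an illustrative name.

```python
def luhn_valid(number):
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))
print(luhn_valid("4111 1111 1111 1112"))
```

Pairing a loose pattern with a checksum like this is what lets a rule reject sixteen-digit strings that are not plausible card numbers.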
Validation Requirements
XML Schema Validation
Rule packs must be valid XML using UTF-8 or UTF-16LE encoding and conform to the Microsoft Classification Engine schema.
Required Elements
- Root Element: Must be RulePackage with correct namespace
- RulePack Element: Must contain id attribute
- Version Element: Must specify all four version components
- Publisher Element: Must contain valid id
- Details Element: Must contain at least one LocalizedDetails
- Rules Section: Must contain at least one rule
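The checklist above can be turned into a pre-deployment check. The sketch below covers only these structural requirements and is not a substitute for schema validation; `check_required` is a hypothetical helper, and it assumes that each rule is an `Entity` element as in the Rules section of this reference.

```python
import xml.etree.ElementTree as ET

MCE_NS = "http://schemas.microsoft.com/office/2011/mce"
VERSION_PARTS = ("major", "minor", "build", "revision")


def check_required(xml_bytes):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_bytes)
    if root.tag != f"{{{MCE_NS}}}RulePackage":
        return ["root element must be RulePackage in the mce namespace"]
    ns = {"m": MCE_NS}
    pack = root.find("m:RulePack", ns)
    if pack is None:
        return ["RulePack element is missing"]
    problems = []
    if not pack.get("id"):
        problems.append("RulePack must have an id attribute")
    version = pack.find("m:Version", ns)
    if version is None or any(version.get(p) is None for p in VERSION_PARTS):
        problems.append("Version must specify all four components")
    publisher = pack.find("m:Publisher", ns)
    if publisher is None or not publisher.get("id"):
        problems.append("Publisher must have an id attribute")
    if pack.find("m:Details/m:LocalizedDetails", ns) is None:
        problems.append("Details must contain at least one LocalizedDetails")
    if pack.find("m:Rules/m:Entity", ns) is None:
        problems.append("Rules must contain at least one rule")
    return problems


print(check_required(b"<Other/>"))
```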
Naming Conventions
| Element | Convention | Example |
|---|---|---|
| Rule Pack ID | Lowercase with hyphens | financial-data-rules |
| Rule ID | Descriptive with context | credit-card-detection |
| Resource ID | Lowercase with hyphens | payment-keywords |
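The "lowercase with hyphens" convention is easy to enforce mechanically. The sketch below is one possible check; allowing digits is an assumption, based on IDs such as rule-id-1 used elsewhere in this reference, and `is_lowercase_hyphenated` is an illustrative name.

```python
import re

# Lowercase letters and digits in hyphen-separated segments (digits assumed OK).
ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_lowercase_hyphenated(identifier):
    """Return True if identifier follows the lowercase-with-hyphens style."""
    return ID_PATTERN.fullmatch(identifier) is not None


for candidate in ("financial-data-rules", "payment-keywords", "Financial_Data"):
    print(candidate, is_lowercase_hyphenated(candidate))
```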
Deployment Considerations
Version Management
- Major Version: Increment for breaking changes
- Minor Version: Increment for new features
- Build: Increment for bug fixes
- Revision: Increment for patches
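A release script can apply these rules by bumping one component at a time. The sketch below assumes that lower-order components reset to zero when a higher one is incremented; that is a common convention, not something this reference specifies. `bump` is a hypothetical helper and does not enforce the attribute ranges.

```python
COMPONENTS = ("major", "minor", "build", "revision")


def bump(version, component):
    """Increment one component and reset the ones after it (assumed)."""
    index = COMPONENTS.index(component)
    trailing_zeros = (0,) * (len(COMPONENTS) - index - 1)
    return version[:index] + (version[index] + 1,) + trailing_zeros


current = (1, 2, 5, 0)
print(bump(current, "minor"))     # new feature
print(bump(current, "revision"))  # patch
```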
Performance Impact
| Factor | Impact | Recommendation |
|---|---|---|
| Number of Rules | High | Group related rules, limit to 50 per pack |
| Resource Size | Medium | Keep keyword lists under 1000 terms |
| Pattern Complexity | High | Use simple patterns when possible |
| Proximity Settings | Medium | Set patternsProximity per rule (1-1000 characters) rather than leaving every rule at the default of 300 |
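The two numeric recommendations in the table (50 rules per pack, keyword lists under 1000 terms) can be checked automatically. This is a rough sketch with the namespace omitted for brevity; `size_warnings` is an illustrative name, and the thresholds are the recommendations above rather than hard limits.

```python
import xml.etree.ElementTree as ET

MAX_RULES = 50    # "limit to 50 per pack"
MAX_TERMS = 1000  # "keep keyword lists under 1000 terms"


def size_warnings(pack):
    """Return warnings for a RulePack element that exceeds the guidance."""
    warnings = []
    rule_count = len(pack.findall("Rules/Entity"))
    if rule_count > MAX_RULES:
        warnings.append(f"{rule_count} rules exceeds the suggested {MAX_RULES}")
    for keyword in pack.findall("Resources/Keyword"):
        term_count = len(keyword.findall("Group/Term"))
        if term_count >= MAX_TERMS:
            warnings.append(
                f"keyword list {keyword.get('id')} has {term_count} terms"
            )
    return warnings


entities = "".join(f'<Entity id="rule-{i}"/>' for i in range(51))
pack = ET.fromstring(f"<RulePack><Rules>{entities}</Rules></RulePack>")
print(size_warnings(pack))
```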
Localization Support
```xml
<Details defaultLangCode="en">
  <LocalizedDetails langcode="en">
    <PublisherName>Security Team</PublisherName>
    <Name>PII Detection Rules</Name>
    <Description>Rules for detecting personally identifiable information</Description>
  </LocalizedDetails>
  <LocalizedDetails langcode="es">
    <PublisherName>Equipo de Seguridad</PublisherName>
    <Name>Reglas de Detección de PII</Name>
    <Description>Reglas para detectar información de identificación personal</Description>
  </LocalizedDetails>
</Details>
```
Example Rule Pack
```xml
<?xml version="1.0" encoding="UTF-8"?>
<RulePackage xmlns="http://schemas.microsoft.com/office/2011/mce">
  <RulePack id="cyberhaven-financial-rules">
    <Version major="1" minor="2" build="5" revision="0"/>
    <Publisher id="cyberhaven-security"/>
    <Details defaultLangCode="en">
      <LocalizedDetails langcode="en">
        <PublisherName>Cyberhaven Security</PublisherName>
        <Name>Financial Data Protection</Name>
        <Description>Comprehensive rules for detecting financial and payment information</Description>
      </LocalizedDetails>
    </Details>
    <Rules>
      <Entity id="credit-card-detection" patternsProximity="300" recommendedConfidence="85">
        <!-- Credit card detection rule -->
      </Entity>
      <Entity id="bank-account-detection" patternsProximity="200" recommendedConfidence="75">
        <!-- Bank account detection rule -->
      </Entity>
    </Rules>
    <Resources>
      <Keyword id="financial-keywords">
        <Group matchStyle="word">
          <Term>account</Term>
          <Term>routing</Term>
          <Term>balance</Term>
          <Term>payment</Term>
        </Group>
      </Keyword>
    </Resources>
  </RulePack>
</RulePackage>
```
Best Practices
Organization
- Group Related Rules: Keep functionally related rules in the same pack
- Logical Naming: Use descriptive, consistent naming conventions
- Version Control: Maintain proper version numbering
- Documentation: Include comprehensive descriptions
Performance
- Optimize Proximity: Set patternsProximity per rule type within the 1-1000 range instead of leaving every rule at the default of 300
- Limit Complexity: Keep rule packs focused and manageable
- Resource Sharing: Use shared resources to reduce duplication
- Testing: Validate rules with representative content
Maintenance
- Regular Updates: Keep rules current with evolving threats
- Performance Monitoring: Track rule performance and accuracy
- Localization: Maintain translations for international deployments
- Backup: Maintain version history for rollback capability